fix: ensure atomic creation of knowledge base with proper cleanup on failure #4406

Li-shi-ling · 2026-01-10T11:10:12Z

标题

fix: ensure atomic creation of knowledge base with proper cleanup on failure

描述

Motivation / 动机

修复了创建知识库时的事务一致性问题。Issue #4403：创建新的知识库实例时，如果没有提供embedding_provider_id，会导致创建了kb数据库记录但是没有创建kb_helper。当下次创建同一个名称时出现SQLite唯一约束冲突，但实际上这个数据库并没有被完整创建。

Modifications / 改动点

修改了 astrbot/core/knowledge_base/kb_mgr.py 中的 create_kb 方法
实现了参数预校验，确保创建知识库前提供必要的 embedding_provider_id
添加了同名知识库检查，避免重复创建
改进了异常处理机制，当KBHelper初始化失败时，使用同一个数据库会话回滚已创建的记录
确保数据一致性，避免留下不完整的数据库记录

修改后的关键代码逻辑：

# 1. 参数预校验
if embedding_provider_id is None:
    raise ValueError("创建知识库时必须提供embedding_provider_id")

# 2. 同名检查
existing_kb = await self.kb_db.get_kb_by_name(kb_name)
if existing_kb is not None:
    raise ValueError(f"知识库名称 '{kb_name}' 已存在")

# 3. 创建数据库记录后，如果KBHelper初始化失败，回滚
try:
    kb_helper = KBHelper(...)
    await kb_helper.initialize()
except Exception:
    # 使用同一个会话来删除，确保事务一致性
    await session.refresh(kb)
    await session.delete(kb)
    await session.commit()
    raise

验证步骤：

在1号示例,我展示了在触发KBHelper的initialize创建时出现报错,这个时候代码正常回滚删除没有创建完成的数据库,没有影响我的下一次同名创建

在2号示例,我先使用了没有embedding_provider_id参数的创建请求,正常抛出错误
然后使用了有embedding_provider_id参数的创建请求,成功创建
然后在下一次同名创建,正常抛出错误

我的测试插件代码(部分)

    @memorychain.command("kbep")
    async def get_ep(self, event: AstrMessageEvent):
        """获取ep列表"""
        ep_names = []
        p_ids = list(self.context.provider_manager.inst_map.keys())
        for p_id in p_ids:
            providers = self.context.get_provider_by_id(p_id)
            if isinstance(providers, EmbeddingProvider):
                ep_names.append(p_id)
        yield event.plain_result(f"能够使用的编码器列表:\n" + "\n".join(ep_names))
        logger.info(f"[memorychain] 能够使用的编码器列表:\n" + "\n".join(ep_names))


    @memorychain.command("kbcr")
    async def kb_create(self, event: AstrMessageEvent, kb_name:str, ep_names:str):
        """创建数据库"""
        await self.context.kb_manager.create_kb(
            kb_name = kb_name,
            embedding_provider_id = ep_names
        )
        yield event.plain_result(f"成功创建数据库:{kb_name}")
        logger.info(f"[memorychain] 成功创建数据库:{kb_name}")


    @memorychain.command("kbcr_cs")
    async def kb_create_cs(self, event: AstrMessageEvent, kb_name:str):
        """创建数据库"""
        await self.context.kb_manager.create_kb(
            kb_name = kb_name
        )
        yield event.plain_result(f"成功创建数据库:{kb_name}")
        logger.info(f"[memorychain] 成功创建数据库:{kb_name}")

Checklist / 检查清单

😊 如果 PR 中有新加入的功能，已经通过 Issue / 邮件等方式和作者讨论过。/ If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.
- 没有加入新功能,只在原本处理逻辑上面加了更完善的处理
👀 我的更改经过了良好的测试，并已在上方提供了"验证步骤"和"运行截图"。/ My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
- 已在本地环境测试了各种边界情况
- 提供了详细的验证步骤
🤓 我确保没有引入新依赖库，或者引入了新依赖库的同时将其添加到了 requirements.txt 和 pyproject.toml 文件相应位置。/ I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in requirements.txt and pyproject.toml.
- 本次修改仅修改现有逻辑，未引入新的依赖库
😮 我的更改没有引入恶意代码。/ My changes do not introduce malicious code.
- 代码修改集中在异常处理和事务一致性，不包含任何恶意代码

额外说明

修复方法遵循了Issue评论中dosubot的建议方案
使用了项目中已有的 get_kb_by_name() 方法，保持代码一致性
修改后的代码更加健壮，能够处理各种异常情况

Summary by Sourcery

通过验证输入、防止名称重复以及清理创建失败的记录，确保知识库实例创建的原子性和一致性。

Bug 修复：

当 KB helper 初始化失败时，防止不完整的知识库记录残留在数据库中。
通过要求必须提供 embedding provider，并在创建前拒绝重复的知识库名称，避免触发 SQLite 唯一约束错误。

增强内容：

增加对必需参数的预验证，并在知识库创建无法完成时提供明确的错误报告。

Original summary in English

Summary by Sourcery

Ensure atomic and consistent creation of knowledge base instances by validating inputs, preventing duplicate names, and cleaning up failed creations.

Bug Fixes:

Prevent incomplete knowledge base records being left in the database when KB helper initialization fails.
Avoid SQLite unique constraint violations by requiring an embedding provider and rejecting duplicate knowledge base names before creation.

Enhancements:

Add pre-validation of required parameters and explicit error reporting when knowledge base creation cannot be completed.

…failure - Added pre-validation for embedding_provider_id parameter - Added check for existing knowledge base with same name - Implemented proper rollback mechanism when KBHelper initialization fails - Uses same session for cleanup to ensure data consistency - Fixes AstrBotDevs#4403

sourcery-ai

Hey - 我发现了 3 个问题，并且给出了一些整体性的反馈：

在插入前对 existing_kb 做预检查会引入竞态窗口；建议依赖数据库层面对 kb_name 的唯一约束，并在违反唯一约束时捕获数据库的完整性错误，再将其映射为对用户友好的异常，而不是先单独查一次是否存在。
为了让整个操作保持完全原子性，可以考虑在 KBHelper 初始化之前避免提交：通过使用 session.flush() 获取 kb_id，只在 kb_helper.initialize() 成功之后再执行提交，而不是先提交、然后在失败时再删除并再次提交。
在异常处理代码块中，await session.refresh(kb) 在 delete(kb) 之前调用，在刚提交之后通常没有必要；你可以考虑移除这次 refresh 来简化清理逻辑。

面向 AI Agent 的提示词

Please address the comments from this code review:

## Overall Comments
- The pre-check for existing_kb before inserting introduces a race window; consider relying on a DB-level unique constraint for kb_name and mapping the integrity error to a user-facing exception instead of doing a separate existence query.
- To keep the operation fully atomic, you might avoid committing before KBHelper initialization by using `session.flush()` to obtain the kb_id and committing only after `kb_helper.initialize()` succeeds, rather than committing and then deleting/committing again on failure.
- Inside the exception handler, `await session.refresh(kb)` before `delete(kb)` seems unnecessary immediately after a commit; you can likely remove the refresh to simplify the cleanup logic.

## Individual Comments

### Comment 1
<location> `astrbot/core/knowledge_base/kb_mgr.py:100-101` </location>
<code_context>
+            raise ValueError("创建知识库时必须提供embedding_provider_id")
+
+        # 检查是否已存在同名知识库
+        existing_kb = await self.kb_db.get_kb_by_name(kb_name)
+        if existing_kb is not None:
+            raise ValueError(f"知识库名称 '{kb_name}' 已存在")
+
</code_context>

<issue_to_address>
**issue (bug_risk):** The name uniqueness check is vulnerable to race conditions under concurrent create calls.

This pre-insert check won’t prevent duplicates under concurrent creates: two requests can both see no existing KB and then both insert. If the DB enforces a unique index on `kb_name`, consider catching the resulting integrity error and mapping it to this `ValueError` so behavior remains correct and consistent under contention.
</issue_to_address>

### Comment 2
<location> `astrbot/core/knowledge_base/kb_mgr.py:117-126` </location>
<code_context>
+        kb_helper = None
</code_context>

<issue_to_address>
**suggestion:** The `kb_helper` sentinel and final RuntimeError appear to be unreachable and can be simplified away.

Because any construction error is re-raised in the `try/except` and successful initialization always assigns a non-None `kb_helper`, the `kb_helper = None` sentinel and the final `RuntimeError("知识库创建失败：未知错误")` are dead code. You can construct `kb_helper` inside the `try`, return it after the `with` block, and remove the extra check and fallback error.

Suggested implementation:

```python
        kb = KnowledgeBase(
            kb_name=kb_name,
            description=description,
            top_k_sparse=top_k_sparse if top_k_sparse is not None else 50,
            top_m_final=top_m_final if top_m_final is not None else 5,
        )

        async with self.kb_db.get_db() as session:
            session.add(kb)
            await session.commit()
            await session.refresh(kb)

```

Based on your comment, there is likely additional code in this function or method that:

1. Uses `kb_helper` after this block, e.g., something like:
   - A `try/except` that wraps the construction of `kb_helper`.
   - A `if kb_helper is None: raise RuntimeError("知识库创建失败：未知错误")` guard.

To fully apply your suggestion, you should also:

1. Move the construction of `kb_helper` inside the `try` block (or inside this `async with` block if appropriate for your logic), and ensure it always gets assigned on success.
2. Rely on the `try/except` to re-raise or translate any construction errors, so there is no need for a `None` sentinel.
3. Remove any `if kb_helper is None:` checks and the fallback `RuntimeError("知识库创建失败：未知错误")`, since they will be unreachable once `kb_helper` is always either:
   - Successfully created and returned, or
   - Throws an exception caught by your `except` block and re-raised.

You will need to apply these changes where the rest of the `kb_helper` logic is defined, which is not visible in the provided snippet.
</issue_to_address>

### Comment 3
<location> `astrbot/core/knowledge_base/kb_mgr.py:131-135` </location>
<code_context>
+                    chunker=CHUNKER,
+                )
+                await kb_helper.initialize()
+            except Exception:
+                await session.refresh(kb)
+                await session.delete(kb)
+                await session.commit()
+                raise
+        # 判断是否成功创建
</code_context>

<issue_to_address>
**suggestion (performance):** The extra `session.refresh(kb)` in the exception handler seems unnecessary and could be dropped.

Because `kb` is immediately deleted in the same session, the refresh doesn’t change the behavior and only adds an extra DB round-trip. Unless you specifically need the latest state before deleting, this call can be removed.

```suggestion
            except Exception:
                await session.delete(kb)
                await session.commit()
                raise
```
</issue_to_address>

Sourcery 对开源项目是免费的——如果你觉得这次 Review 有帮助，欢迎分享 ✨

_{帮我变得更有用！请在每条评论上点 👍 或 👎，我会根据你的反馈改进后续的 Review。}

Original comment in English

Hey - I've found 3 issues, and left some high level feedback:

The pre-check for existing_kb before inserting introduces a race window; consider relying on a DB-level unique constraint for kb_name and mapping the integrity error to a user-facing exception instead of doing a separate existence query.
To keep the operation fully atomic, you might avoid committing before KBHelper initialization by using session.flush() to obtain the kb_id and committing only after kb_helper.initialize() succeeds, rather than committing and then deleting/committing again on failure.
Inside the exception handler, await session.refresh(kb) before delete(kb) seems unnecessary immediately after a commit; you can likely remove the refresh to simplify the cleanup logic.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- The pre-check for existing_kb before inserting introduces a race window; consider relying on a DB-level unique constraint for kb_name and mapping the integrity error to a user-facing exception instead of doing a separate existence query.
- To keep the operation fully atomic, you might avoid committing before KBHelper initialization by using `session.flush()` to obtain the kb_id and committing only after `kb_helper.initialize()` succeeds, rather than committing and then deleting/committing again on failure.
- Inside the exception handler, `await session.refresh(kb)` before `delete(kb)` seems unnecessary immediately after a commit; you can likely remove the refresh to simplify the cleanup logic.

## Individual Comments

### Comment 1
<location> `astrbot/core/knowledge_base/kb_mgr.py:100-101` </location>
<code_context>
+            raise ValueError("创建知识库时必须提供embedding_provider_id")
+
+        # 检查是否已存在同名知识库
+        existing_kb = await self.kb_db.get_kb_by_name(kb_name)
+        if existing_kb is not None:
+            raise ValueError(f"知识库名称 '{kb_name}' 已存在")
+
</code_context>

<issue_to_address>
**issue (bug_risk):** The name uniqueness check is vulnerable to race conditions under concurrent create calls.

This pre-insert check won’t prevent duplicates under concurrent creates: two requests can both see no existing KB and then both insert. If the DB enforces a unique index on `kb_name`, consider catching the resulting integrity error and mapping it to this `ValueError` so behavior remains correct and consistent under contention.
</issue_to_address>

### Comment 2
<location> `astrbot/core/knowledge_base/kb_mgr.py:117-126` </location>
<code_context>
+        kb_helper = None
</code_context>

<issue_to_address>
**suggestion:** The `kb_helper` sentinel and final RuntimeError appear to be unreachable and can be simplified away.

Because any construction error is re-raised in the `try/except` and successful initialization always assigns a non-None `kb_helper`, the `kb_helper = None` sentinel and the final `RuntimeError("知识库创建失败：未知错误")` are dead code. You can construct `kb_helper` inside the `try`, return it after the `with` block, and remove the extra check and fallback error.

Suggested implementation:

```python
        kb = KnowledgeBase(
            kb_name=kb_name,
            description=description,
            top_k_sparse=top_k_sparse if top_k_sparse is not None else 50,
            top_m_final=top_m_final if top_m_final is not None else 5,
        )

        async with self.kb_db.get_db() as session:
            session.add(kb)
            await session.commit()
            await session.refresh(kb)

```

Based on your comment, there is likely additional code in this function or method that:

1. Uses `kb_helper` after this block, e.g., something like:
   - A `try/except` that wraps the construction of `kb_helper`.
   - A `if kb_helper is None: raise RuntimeError("知识库创建失败：未知错误")` guard.

To fully apply your suggestion, you should also:

1. Move the construction of `kb_helper` inside the `try` block (or inside this `async with` block if appropriate for your logic), and ensure it always gets assigned on success.
2. Rely on the `try/except` to re-raise or translate any construction errors, so there is no need for a `None` sentinel.
3. Remove any `if kb_helper is None:` checks and the fallback `RuntimeError("知识库创建失败：未知错误")`, since they will be unreachable once `kb_helper` is always either:
   - Successfully created and returned, or
   - Throws an exception caught by your `except` block and re-raised.

You will need to apply these changes where the rest of the `kb_helper` logic is defined, which is not visible in the provided snippet.
</issue_to_address>

### Comment 3
<location> `astrbot/core/knowledge_base/kb_mgr.py:131-135` </location>
<code_context>
+                    chunker=CHUNKER,
+                )
+                await kb_helper.initialize()
+            except Exception:
+                await session.refresh(kb)
+                await session.delete(kb)
+                await session.commit()
+                raise
+        # 判断是否成功创建
</code_context>

<issue_to_address>
**suggestion (performance):** The extra `session.refresh(kb)` in the exception handler seems unnecessary and could be dropped.

Because `kb` is immediately deleted in the same session, the refresh doesn’t change the behavior and only adds an extra DB round-trip. Unless you specifically need the latest state before deleting, this call can be removed.

```suggestion
            except Exception:
                await session.delete(kb)
                await session.commit()
                raise
```
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

astrbot/core/knowledge_base/kb_mgr.py

…ndition risks

Li-shi-ling · 2026-01-10T11:42:00Z

手顺把注释改了,现在改回去了🏃

Li-shi-ling · 2026-01-10T12:43:52Z

这个新x修改后的测试

auto-assign bot requested review from advent259141 and anka-afk January 10, 2026 11:10

dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. feature:knowledge-base The bug / feature is about knowledge base labels Jan 10, 2026

sourcery-ai bot reviewed Jan 10, 2026

View reviewed changes

astrbot/core/knowledge_base/kb_mgr.py Outdated Show resolved Hide resolved

astrbot/core/knowledge_base/kb_mgr.py Outdated Show resolved Hide resolved

astrbot/core/knowledge_base/kb_mgr.py Outdated Show resolved Hide resolved

Li-shi-ling added 2 commits January 10, 2026 19:30

fix: ensure atomic KB creation with session.flush() to remove race co…

b3b63a2

…ndition risks

fix: ensure change the annotation back

40cd208

Soulter approved these changes Jan 11, 2026

View reviewed changes

dosubot bot added the lgtm This PR has been approved by a maintainer label Jan 11, 2026

Soulter merged commit 9a91f2f into AstrBotDevs:master Jan 11, 2026
6 checks passed

Soulter changed the title ~~fix: ensure atomic creation of knowledge base with proper cleanup on …~~ fix: ensure atomic creation of knowledge base with proper cleanup on failure Jan 11, 2026

Li-shi-ling deleted the fix/4403-kb-creation-transaction branch January 12, 2026 01:10

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

fix: ensure atomic creation of knowledge base with proper cleanup on failure #4406

fix: ensure atomic creation of knowledge base with proper cleanup on failure #4406

Li-shi-ling commented Jan 10, 2026 •

edited by sourcery-ai bot

Loading

Uh oh!

sourcery-ai bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Li-shi-ling commented Jan 10, 2026

Uh oh!

Li-shi-ling commented Jan 10, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

fix: ensure atomic creation of knowledge base with proper cleanup on failure #4406

fix: ensure atomic creation of knowledge base with proper cleanup on failure #4406

Conversation

Li-shi-ling commented Jan 10, 2026 • edited by sourcery-ai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

标题

描述

Motivation / 动机

Modifications / 改动点

Checklist / 检查清单

额外说明

Summary by Sourcery

Summary by Sourcery

Uh oh!

sourcery-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Li-shi-ling commented Jan 10, 2026

Uh oh!

Li-shi-ling commented Jan 10, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Li-shi-ling commented Jan 10, 2026 •

edited by sourcery-ai bot

Loading